Protein arrays and uses thereof

ABSTRACT

Illustrative embodiments herein disclosed relate to protein arrays, methods for making the arrays and methods for using them, among others. In some embodiments known proteins representing at least 50% of the loci in the human genome are arrayed in known positions on a support. In some embodiments arrays are made of proteins purified from cell lysates by affinity binding to the support. In some embodiments protein arrays are used to decode the binding specificity of antibodies. In some embodiments protein arrays are used to diagnose auto-immune disorders. Many other embodiments and general features are disclosed.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority of U.S. Provisional Application No. 61/245,852, filed Sep. 25, 2009, which is hereby incorporated by reference in its entirety.

FIELD

Embodiments herein disclosed relate to the field of protein arrays, methods for making protein arrays, and uses of the arrays for research and to diagnose disease, among other things.

I

Protein arrays allow many protein-based assays to be carried out in parallel. They provide greatly increased throughput compared to individual assays. They typically use less reagent per assay and require less time per assay as well. As a result, arrays generally provide greatly reduced costs per assay, even when the cost of fabricating the arrays is taken into account. In addition, simultaneous performance of multiple assays in arrays provides for redundancy of individual assays and the ability to assay the same parameter in multiple ways, leading to improved precision and accuracy of results, compared to individual assays. In addition, full genomic protein arrays offer possibilities for detecting proteome-wide protein-molecule interactions. Such genome-wide surveys will be powerful tools for understanding protein-protein interactions, decoding antibody binding specificities and cross-reactions, and for identifying biomarkers for diagnosis and patient stratification, to name a few salient applications.

Several major formats of protein arrays have been described. In forward phase protein micro-arrays, capture proteins with well defined specificities for particular targets are immobilized at defined locations in an array, and target compounds are identified and quantified by the positions and intensities of binding of a sample to the array. The primary use of forward phase arrays is to interrogate individual samples to determine the presence and amount of a large number of different components simultaneously. In one typical type of forward phase array, the array is made up of a panoply of antibodies specific for particular antigens and the array is used to measure the presence and amounts of these antigens in a sample.

In reverse phase arrays, a panoply of samples are arrayed and then probed with an identifying reagent, typically with a mono-specific reagent, such as an antibody specific for a particular antigen. The primary use of reverse phase arrays is to characterize a large number of samples for the presence and amount of one, or at most a few, components. An illustrative use of reverse phase arrays is to screen a series of mono-specific reagents, such as antibodies specific for particular antigens, against a collection of different cell types.

A number of reviews on protein arrays have been published, which describe types, uses, advantages and disadvantages of current protein array technologies, including those of Joos and Bachmann (2009); “Protein microarrays: potentials and limitations,” Frontiers in Biosciences 14: 4376-4385; Chan et al. (2004); “Protein microarrays for multiplex analysis of signal transduction pathways,” Nature Medicine 10(12): 1390-1396; Hartmann et al. (2009); “Protein microarrays for diagnostic assays,” Anal. Bioanal. Chem. 393: 1407-1416, and Caron et al. (2007); “Cancer Immunomics Using Autoantibody Signature for Biomarker Discovery,” Molecular & Cellular Proteomics 6.7: 1115-1122. Additional references are provided under VII, below.

Previous protein arrays typically either comprised relatively small numbers of purified proteins or were made by reverse transfection of large numbers of previously arrayed DNAs into host cells which, after growth and expression of the transfected DNAs, were lysed in situ. Arrays in the former category have been limited by the number of proteins that can be practically obtained; that is, by the difficulties of protein purification that must be overcome for each protein in the array. Arrays in the latter category have been limited by heterogeneity in transfection and expression results and by the limited protein density that can be obtained from confluent cells lysed on a surface in situ. For these reasons, and a variety of others, protein arrays presently available suffer from a variety of limitations and disadvantages, and there is a need for improved protein arrays and for arrays that provide functionality not available with present technology.

II

The following numbered paragraphs are provided by way of illustration and describe a few of the many aspects and embodiments of the inventions herein disclosed. Many others are described herein and will be readily apparent to those skilled in the arts to which they pertain. The use of the phrase “any of the foregoing or the following” in the numbered paragraphs indicates that the various elements set forth in the numbered paragraphs can be combined in any way, and it is used to provide explicit support for any such combination. Applicant reserves the right to set out explicitly and/or claim any one or more of the combinations thus abbreviated, in whole or part, by amendment in this or any successor or related application.

1.01. A method for making a protein array, comprising applying a plurality of cell lysates, comprising a corresponding plurality of proteins, to a corresponding plurality of positions on a support, wherein said plurality of proteins is expressed in said corresponding plurality of cells via a corresponding plurality of exogenous DNAs.

1.02. A method for making a protein array, comprising applying lysates L₁ through L_(n) comprising proteins P₁ through P_(n) to positions S₁ through S_(n) on a support,

wherein each lysate L_(x) is of cells C_(x), comprising protein P_(x) expressed therein via exogenous DNA D_(x) and is applied to position S_(x), wherein

P₁ through P_(n) are all different from one another,

S₁ through S_(n) are all different from one another,

n is an integer greater than 1 and

x is an integer from 1 to n.

In embodiments, as set forth herein, x is a fraction of the genes, loci, or protein coding regions in a genome, particularly as set forth elsewhere herein. In embodiments, as set forth herein below, x is a set number of genes, loci or protein coding genes of an organism.

1.03. A method for making a protein array, comprising applying proteins P₁ through P_(n) to positions S₁ through S_(n) on a support,

wherein each protein P_(x) is expressed in cells C_(x) via exogenous DNA D_(x) and is applied to position S_(x),

wherein

P₁ through P_(n) are all different from one another,

S₁ through S_(n) are all different from one another,

n is an integer greater than 1 and

x is an integer from 1 to n.

1.04. A method for making an array of proteins, comprising:

applying a plurality of lysates comprising a corresponding plurality of proteins to a corresponding plurality of positions on a support, thereby producing an array of said plurality of proteins,

wherein said lysates are produced by a method comprising expressing a plurality of proteins in a corresponding plurality of cell colonies or cultures via a corresponding plurality of exogenous DNAs in cells of said colonies or cultures, and lysing each of said plurality of cell colonies or cultures thereby to produce a corresponding plurality of lysates comprising said corresponding plurality of proteins.

1.05. A method for making an array of proteins, comprising:

expressing a plurality of two or more proteins in a corresponding plurality of cell colonies or cultures via a corresponding plurality of exogenous DNAs in cells of said colonies or cultures;

lysing each of said plurality of cell colonies or cultures thereby to produce a corresponding plurality of lysates comprising said corresponding plurality of proteins;

applying said plurality of lysates comprising said corresponding plurality of proteins to a corresponding plurality of positions on a support;

thereby producing an array of said plurality of proteins.

2.01. A method according to any of the foregoing or the following, wherein said proteins are over expressed via said DNAs in said cells.

2.02. A method according to any of the foregoing or the following, wherein said proteins are present at high concentrations in said lysates.

2.03. A method according to any of the foregoing or the following, wherein at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said proteins, other than controls, is at least any of 0.01, 0.025, 0.05, 0.10, 0.15, 0.25, 0.35, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, 3.50, 4.00, 4.50, 5.00, 5.50, 6.00, 7.50, 10.0, 15.0 or 20 percent of the total protein in said cells.

2.04. A method according to any of the foregoing or the following, wherein at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said proteins, other than controls, is expressed in said cells via said exogenous DNA in an amount that is substantially more than any endogenous expression of said protein in said cells.

2.05. A method according to any of the foregoing or the following, wherein at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said proteins, other than controls, is expressed via said DNAs in said cells in an amount that is at least any of 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 7.5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 750, 1,000, 1,500 or more times any endogenous expression of said protein in said cells.

2.06. A method according to any of the foregoing or the following, wherein at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said proteins, other than controls, is at least any of 0.01, 0.025, 0.05, 0.10, 0.15, 0.25, 0.35, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, 3.50, 4.00, 4.50, 5.00, 5.50, 6.00, 7.50, 10.0, 15.0 or 20 percent of the total protein in said lysates.

2.07. A method according to any of the foregoing or the following, wherein in at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said lysates, other than controls, the concentration of each protein in each lysate is at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 μg/ml or at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 mg/ml.

3.01. A method according to any of the foregoing or the following, wherein the proteins collectively comprise at least any of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 98 percent of the proteins encoded by a genome of an organism. In embodiments the organism is a mammal. In embodiments the organism is any one of a mouse, rat, sheep, goat, dog or primate.

3.02. A method according to any of the foregoing or the following, wherein the proteins collectively comprise at least any of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 98 percent of the proteins encoded by a human genome.

3.03. A method according to any of the foregoing or the following, wherein the array comprises proteins of at least any of 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 30,000 different loci of an organism.

3.04. A method according to any of the foregoing or the following, wherein the array comprises at least any of 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 30,000 different proteins.

3.05. A method according to any of the foregoing or the following, wherein the array comprises at least any of 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 30,000 positions to which said proteins have been applied.

3.06. A method according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 positions per cm².

3.07. A method according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 lysates applied per cm².

3.08. A method according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 of said proteins applied per cm².

3.09. A method according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 positions per cm² with said proteins applied to at least any of 50, 60, 70, 80, 90 or 95% of said positions.

3.10. A method according to any of the foregoing or the following, wherein the spots are any of 10-50, 25-75, 50-100, 75-150, 100-200, 150-250, 200-300, 250-350, 300-400, 400-500, 500-750, 400-800, 750-1,000 μm in diameter.

3.11. A method according to any of the foregoing or the following, wherein the area of the features is any of 10-50, 25-75, 50-100, 75-150, 100-200, 150-250, 200-300, 250-350, 300-400, 400-500, 500-750, 400-800, 750-1,250, 1,000-2,000, 1,500-3,000, 2,500-5,000 μm².

3.12. A method according to any of the foregoing or the following, wherein the center to center spacing of the features (spots) is any of 5-15, 10-20, 15-25, 20-40, 25-50, 25-75, 50-100, 75-150, 100-150, 125-175, 150-225, 200-250, 225-275, 250-350, 300-400 or 400-500 μm.

4.01. A method according to any of the foregoing or the following, wherein the concentration of proteins in each lysate is at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 micrograms/ml or at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 mg/ml.

4.02. A method according to any of the foregoing or the following, wherein for at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said lysates, other than controls, the amount of lysate protein is the amount of total protein in at least any of 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,500, 2,000 or 2,500 of the cells of the lysate.

4.03. A method according to any of the foregoing or the following, wherein for at least any of 50, 60, 75, 80, 85, 90, 95, 99 or 100% of said proteins, other than controls, the amount of said protein expressed via an exogenous DNA is at least any of the amount of said protein expressed via said exogenous DNA in 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,500, 2,000 or 2,500 cells in which the protein was expressed.

5.01. A method according to any of the foregoing or the following, wherein said cells are eukaryotic cells.

5.02. A method according to any of the foregoing or the following, wherein said cells are prokaryotic cells.

5.03. A method according to any of the foregoing or the following, wherein the cells are any one or more of HEK293, COS, CV1, BHK, CHO, HeLa, LTK, or NIH 3T3 cells.

5.04. A method according to any of the foregoing or the following, wherein the cells are HEK293T cells.

6.01. A method according to any of the foregoing or the following, wherein one or more of said exogenous DNAs each encodes one of said proteins.

6.02. A method according to any of the foregoing or the following, wherein one or more of said exogenous DNAs each encodes one of said proteins and each such protein in each such exogenous DNA is encoded by a cDNA, a genomic DNA, or a synthetic DNA.

6.03. A method according to any of the foregoing or the following, wherein one or more of said exogenous DNAs is an expression construct comprising cis-acting elements effective for transcription in said cells operably linked to DNAs encoding one of said proteins. In embodiments the cis-acting elements include a promoter.

6.04. A method according to any of the foregoing or the following, wherein one or more of said exogenous DNAs is an expression construct comprising a promoter (and, optionally, other cis-acting genetic elements) effective for transcription in said cells operably linked to DNAs encoding one of said proteins, wherein said DNA encoding said protein in each of said one or more exogenous DNAs is a cDNA, a genomic DNA or a synthetic DNA.

6.05. A method according to any of the foregoing or the following, wherein said promoter is any one or more of a CMV, SV40 or MMTV promoter.

6.06. A method according to any of the foregoing or the following, wherein one or more of said exogenous DNAs encodes a chimeric protein comprising substantially the amino acid sequence of a protein for said array fused in correct reading frame to a tag sequence effective for any one or more of attachment, immobilization, capture, purification, detection and/or quantification.

6.07. A method according to any of the foregoing or the following, wherein said tag is any one or more of a GST, HA, V5, HIS, DDK (or FLAG) or myc tag.

6.08. A method according to any of the foregoing or the following, wherein said tag is a myc/FLAG tag.

6.09. A method according to any of the foregoing or the following, wherein said expression construct is a pCMV6-entry expression vector.

6.10. A method according to any of the foregoing or the following, wherein said exogenous DNA comprises a construct for non-homologous recombinatorial activation of expression of an endogenous gene encoding a protein for said protein array.

6.11. A method according to any of the foregoing or the following, wherein said exogenous DNA comprises a construct for homologous recombinatorial activation of expression of an endogenous gene encoding a protein for said array.

7.01. A method according to any of the foregoing or the following, wherein the support is any support described or listed elsewhere herein.

7.02. A method according to any of the foregoing or the following, wherein the support comprises nitrocellulose.

7.03. A method according to any of the foregoing or the following, wherein the support comprises a nitrocellulose coated glass slide.

8.01. A method according to any of the foregoing or the following, wherein the proteins are purified from the lysates prior to application to the support.

8.02. A method according to any of the foregoing or the following, wherein the proteins are purified prior to application to the support to at least any of 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% pure.

8.03. A method according to any of the foregoing or the following, wherein the proteins are purified prior to application to the support so as to be at least any of 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% of the total protein applied to the support.

8.04. A method according to any of the foregoing or the following, wherein the proteins are purified by affinity chromatography prior to application to the support.

8.05. A method according to any of the foregoing or the following, wherein the proteins comprise an affinity tag and are purified prior to application to the support by affinity chromatography specific for the affinity tag.

8.06. A method according to any of the foregoing or the following, wherein the proteins comprise a peptide affinity tag and are purified prior to application to the support by affinity chromatography specific for the peptide affinity tag.

8.07. A method according to any of the foregoing or the following, wherein the proteins comprise a DDK affinity tag and are purified prior to application to the support by immunoaffinity chromatography using an antibody specific for the DDK tag.

9.01. A method according to any of the foregoing or the following, wherein the proteins are purified from the lysates following application to the support.

9.02. A method according to any of the foregoing or the following, wherein the proteins are purified following application to the support to at least any of 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or more of homogeneously pure.

9.03. A method according to any of the foregoing or the following, wherein the proteins are purified following application to the support so as to be at least any of 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% of the total protein immobilized on the support.

9.04. A method according to any of the foregoing or the following, wherein the proteins are purified following application to the support by binding to an affinity moiety on the support specific to the proteins expressed via the exogenous DNA, and removing unbound material.

9.05. A method according to any of the foregoing or the following, wherein the proteins comprise an affinity tag and are purified following application to the support by binding to an affinity moiety on the support specific for the tag, and removing unbound material.

9.06. A method according to any of the foregoing or the following, wherein the proteins comprise a peptide affinity tag and are purified following application to the support by binding to an affinity moiety specific for the peptide tag and removing unbound material.

9.07. A method according to any of the foregoing or the following, wherein the proteins comprise a DDK affinity tag and are purified following application to the support by binding to an affinity moiety specific for the DDK affinity tag and removing unbound material. In embodiments the DDK-specific affinity moiety is a DDK-specific antibody.

10.01. A protein array according to any of the foregoing or the following, made by any of the foregoing methods.

10.02. A protein array according to any of the foregoing or the following, comprising at least any of 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 30,000 different proteins.

10.03. A protein array according to any of the foregoing or the following, comprising at least any of 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 30,000 different loci of a genome of an organism.

10.04. A protein array according to any of the foregoing or the following, comprising at least any of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 98 percent of the proteins encoded by a genome of an organism.

10.05. A protein array according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 lysates applied per cm².

10.06. A protein array according to any of the foregoing or the following, wherein there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 of said proteins expressed via said exogenous DNAs per cm².

10.07. A protein array according to any of the foregoing or the following, wherein, for at least any of 25, 50, 60, 75, 80, 85, 90, 95, 99 or 100% of the positions to which said proteins are applied, the amount of said protein expressed via said exogenous DNA is at least any of the amount of said protein expressed via said exogenous DNA in 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,500, 2,000 or 2,500 cells in which each said protein was expressed.

10.08. A protein array according to any of the foregoing or the following, comprising alignment markers.

11.01. A method of determining the antibody specificity and/or cross reaction of one or more antibodies to proteins, comprising contacting an antibody with a protein array according to any of the foregoing or the following and determining binding of the antibody thereto. In embodiments the protein array comprises a substantial fraction of all the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.02. A method for determining one or more specificities and/or one or more cross reactivities of an antibody preparation, comprising contacting an antibody preparation with a protein array in accordance with any of the foregoing or the following and determining binding of antibodies in the preparation thereto. In embodiments the antibody preparation is a whole cell anti-serum. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.03. A method for determining the binding specificity of an antibody or an antibody preparation, comprising determining the binding of the antibody or antibody preparation to a protein array in accordance with any of the foregoing or the following and, from the binding to the array thus determined, identifying the proteins specifically bound thereby. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.04. A method for determining protein biomarkers of disease, comprising determining binding of samples from one or more healthy individuals and from one or more diseased individuals suffering from a disease to a protein array in accordance with any of the foregoing or the following, and from differences in the binding of the samples from the healthy and diseased individuals determining protein biomarkers of the disease. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.05. A method for determining biomarkers of an autoimmune disease, comprising determining binding to a protein array in accordance with any of the foregoing or the following of antibody-containing samples from one or more healthy subjects and from one or more subjects suffering from an autoimmune disease, and from differences in the binding of the antibodies in the samples from the healthy subjects and the subjects suffering from an autoimmune disease determining protein biomarkers of the autoimmune disease. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.06. A method for determining biomarkers of a disease characterized by the presence of antibodies not present in healthy individuals, comprising determining binding to a protein array in accordance with any of the foregoing or the following of samples from one or more healthy subjects and from one or more subjects suffering from a disease characterized by the presence of antibodies not present in healthy individuals, and from differences in the binding of the antibodies in the samples from the healthy subjects and the subjects suffering from the disease determining protein biomarkers of the disease. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.07. A method for diagnosing a disease characterized by the presence of antibodies not present in healthy individuals, comprising determining binding to a protein array in accordance with any of the foregoing or the following of an antibody containing sample from a subject possibly suffering from the disease and from the binding of antibodies in the sample to the array determining the absence or the presence of the disease. In embodiments the protein array comprises a substantial fraction of the proteins encoded by a genome in accordance with any of the foregoing or the following.

11.08. A method for monitoring signal transduction pathways, comprising determining binding to a protein array in accordance with any of the foregoing or the following of a sample comprising signal transduction pathway proteins, whereby said binding is indicative of the absence, the presence and/or the amount of said proteins. In embodiments the sample is a whole cell lysate. In examples the sample comprises cells in which protein expression via an exogenous DNA can alter the proteins of said signal transduction pathway. In embodiments changes in any one or more of abundance, post-translational modification, or stability of said proteins are monitored. In embodiments binding of the proteins is detected using one or more protein-specific antibodies. In embodiments binding to the protein arrays is used to decode functional connections between proteins expressed via an exogenous DNA and endogenous proteins of one or more signal transduction pathways. In embodiments several determinations are made in succession and changes in the status of proteins in one or more signal transduction pathways are monitored.

11.09. A method for determining interactions between small molecules and proteins, comprising determining the binding to protein arrays in accordance with any of the foregoing or the following of a sample comprising said small molecules. In embodiments the small molecules are any one or more of small organic molecules, fats, fatty acids, fatty acid esters, lipids, sugars, glycans, nucleic acids, polynucleotides, amino acids, peptides or polypeptides, or any other small molecules. In embodiments the small molecules are detectably labeled with a detectable label. In embodiments binding of the small molecules is detected using a secondary agent that binds to small molecules bound to the array.
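By way of illustration only, the relationship between the center to center spacing of paragraph 3.12 and the positions per cm² of paragraph 3.06 can be sketched as follows. The sketch assumes a regular square grid, which the numbered paragraphs do not require, and ignores edge effects; the function names are hypothetical and are not part of the disclosure.

```python
import math

UM_PER_CM = 10_000  # 1 cm = 10,000 micrometers


def positions_per_cm2(pitch_um: float) -> int:
    """Whole positions that fit in 1 cm x 1 cm on a square grid.

    Assumption (not from the source): features sit on a regular square
    grid whose center to center spacing is pitch_um in both directions.
    """
    if pitch_um <= 0:
        raise ValueError("pitch_um must be positive")
    per_side = math.floor(UM_PER_CM / pitch_um)
    return per_side * per_side


def max_pitch_um(density_per_cm2: float) -> float:
    """Largest square-grid spacing that still reaches the given density."""
    if density_per_cm2 <= 0:
        raise ValueError("density_per_cm2 must be positive")
    return UM_PER_CM / math.sqrt(density_per_cm2)
```

Under these assumptions a 500 μm spacing corresponds to 400 positions per cm², and a density of 10,000 positions per cm² requires a spacing of no more than 100 μm.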

III

Words, terms and phrases generally are used herein in accordance with their ordinary meanings to those skilled in the arts to which they pertain, except as may be defined otherwise herein. For clarity, illustrative explanations of certain terms and phrases are set forth below. These illustrative explanations are set out exclusively as an aid to understanding the inventions herein described, and they are not limitative of the invention, and should not be understood to unduly limit the invention in any way.

Lysate L₁ through L_(n) designates a plurality of n lysates, numbered consecutively 1 through n, where n is at least 2. Some of the lysates may be the same or they may all be different.

Cells (generally a population of cells) C₁ through C_(n) designates a plurality of n cells (generally n cell populations), numbered consecutively 1 through n, where n is at least 2. Some of the cells may be the same or they may all be different.

Proteins P₁ through P_(n) designates a plurality of n proteins, numbered consecutively 1 through n, where n is at least 2. Some of the proteins may be the same or they may all be different.

Positions S₁ through S_(n) designates a plurality of n positions (generally in an array) numbered consecutively 1 through n, where n is at least 2. Some of the proteins may be the same or they may all be different. The identities of some or all of the proteins may be known or unknown.

DNAs D₁ through D_(n) designates a plurality of n DNAs, numbered consecutively 1 through n, where n is at least 2. Some of the DNAs may be the same or they may all be different. The identities of some or all of the DNAs may be known or unknown.

The terms “respectively” (and “corresponding”) used with these designations mean a correspondence between them. For instance, lysates L₁ through L_(n) of cells C₁ through C_(n) expressing proteins P₁ through P_(n), via DNAs D₁ through D_(n), at positions S₁ through S_(n) respectively means lysate L₁ of cells C₁ expressing protein P₁ via DNA D₁ at position S₁, lysate L₂ of cells C₂ expressing protein P₂ via DNA D₂ at position S₂, and so on through lysate L_(n) of cells C_(n) expressing protein P_(n) via DNA D_(n) at position S_(n).
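By way of illustration only, the respective correspondence can be pictured as index-aligned sequences. The following is a minimal Python sketch; the record type `ArrayEntry`, the function `respectively`, and the placeholder names are hypothetical and are not part of the disclosure.

```python
from typing import NamedTuple


class ArrayEntry(NamedTuple):
    """One respective set: lysate i, cells i, protein i, DNA i, position i."""
    lysate: str
    cells: str
    protein: str
    dna: str
    position: str


def respectively(lysates, cells, proteins, dnas, positions):
    """Pair the i-th member of each designation, for i = 1 through n."""
    groups = (lysates, cells, proteins, dnas, positions)
    n = len(lysates)
    # Each designation is a plurality of n members, where n is at least 2.
    if n < 2 or any(len(g) != n for g in groups):
        raise ValueError("all designations need the same n, with n at least 2")
    return [ArrayEntry(*row) for row in zip(*groups)]


# Placeholder names L1..L3, C1..C3, and so on (illustrative only).
n = 3
entries = respectively(
    [f"L{i}" for i in range(1, n + 1)],
    [f"C{i}" for i in range(1, n + 1)],
    [f"P{i}" for i in range(1, n + 1)],
    [f"D{i}" for i in range(1, n + 1)],
    [f"S{i}" for i in range(1, n + 1)],
)
for entry in entries:
    print(entry)
```

Each printed record reads as one clause of the example above, such as lysate L₂ of cells C₂ expressing protein P₂ via DNA D₂ at position S₂.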

Antibody as used herein includes polyclonal and monoclonal antibodies and derivatives thereof, including but not limited to the following: F(ab)₂ and F(ab) fragments, including fragments of the following; hybrid (chimeric) antibody molecules, as described in for example Winter et al. (1991) Nature 349:293-299 and U.S. Pat. No. 4,816,567; Fv molecules (non-covalent heterodimers) as described in for example Inbar et al. (1972) Proc Natl Acad Sci USA 69:2659-2662 and Ehrlich et al. (1980) Biochem 19:4091-4096; single-chain Fv molecules (sFv) as described in for example Huston et al. (1988) Proc Natl Acad Sci USA 85:5879-5883; dimeric and trimeric antibody fragment constructs; minibodies, as described in for example Pack et al. (1992) Biochem 31:1579-1584 and Cumber et al. (1992) J. Immunology 149B:120-126; humanized antibody molecules, as described for example in Riechmann et al. (1988) Nature 332:323-327; Verhoeyen et al. (1988) Science 239:1534-1536; and U.K. Patent Publication No. GB 2,276,169, published Sep. 21, 1994; and any functional fragments obtained from such molecules, such as fragments that retain antigen binding properties.

Antigen is used herein broadly to indicate any agent which elicits an immune response in the body, typically by binding to an antibody, T-cell receptor or other antigen binding immune system molecule. An antigen typically has one or more epitopes.

Array as used herein generally refers to an ordered arrangement of discrete positions. A protein array typically is an ordered arrangement of proteins in discrete positions. Often protein arrays comprise a set of discrete positions on a surface with proteins disposed at one or more of the positions. Typically, the positions, particularly those with proteins disposed therein, are at known locations in the array, and the positions typically have a spatial address, such as a 2-dimensional denomination, akin to an x,y coordinate in a two dimensional Cartesian coordinate system. Obviously, arrays can be made in any desired geometry and other addressing schemes can be employed to denote the unique locations of positions and/or of proteins in an array. In embodiments some or all of the proteins in an array are known proteins.

DNA is used herein to denote polydeoxyribonucleotides, including modified forms of naturally occurring DNAs, such as DNAs with unusual bases, incorporating labels, or chemically modified DNAs. While many of the examples and illustrations herein are written in terms of DNA, other polynucleotides can be used in much the same way, such as RNAs. Moreover, when RNA is introduced into a host cell typically it is converted to DNA and the phrase expressed via an exogenous DNA thus includes expression resulting from the introduction of RNA.

DDK is used herein to denote a peptide tag, commercially known as FLAG. The terms DDK and FLAG are used interchangeably herein.

Epitopes are individual specific features (such as structural features) of an antigen that are recognized (bound) by an antibody. Antigens comprise one or more epitopes. Different antibodies may bind the same or different epitopes on a given antigen. The epitopes on a protein antigen may be defined by continuous or discontinuous portions of the amino acid sequence.

Recombinant protein is used herein to mean a protein produced using molecular cloning techniques, such as a protein expressed via an exogenous polynucleotide, such as an exogenous DNA. As discussed in greater detail elsewhere herein, expression of a protein via an exogenous polynucleotide can be engendered by introducing into a host cell a polynucleotide encoding the protein or by introducing into a host cell a polynucleotide that engenders increased expression of an endogenous gene, such as by promoter activation, or by other methods.

Specific binding-partner as used herein indicates an agent that binds specifically to a target. Specific binding indicates that the agent can distinguish a target, such as an antigen, or an epitope within an antigen, from other non-target substances. An antibody specific for an antigen and the antigen are an example of specific binding partners. A specific binding partner is specific in the sense that it can be used to detect a target above background noise, typically a function of non-specific binding. For example, a specific binding partner of a protein can detect a specific feature such as a sequence or a topological conformation of the protein. A specific feature can be for instance a defined order of amino acids or a defined chemical moiety. For instance, an antibody that binds to a protein specifically may be specific for a short amino acid sequence of a protein, it may be specific for a specific amino acid modification, such as phosphorylation of tyrosine (phosphotyrosine), or it may be specific for a particular carbohydrate configuration (glycan structure) in the protein, among others.

Support is used broadly herein to mean a surface-providing structure to which proteins may be applied to form an array. Typically a support is solid and structurally stable to the manipulations required to make the array and to use it. A support can have one or more components, such as a glass slide for solidity and a nitrocellulose “pad” for immobilizing proteins in an array.

IV BRIEF DESCRIPTIONS OF THE FIGURES AND TABLES

Table 1 is a schematic diagram showing a general method for making arrays in accordance with various embodiments of the inventions herein described.

FIG. 1 is a schematic diagram showing a modular array layout, with an enlarged view of a subarray illustrating the layout of duplicate samples and controls.

FIG. 2 shows a protein array of 3720 lysates spotted in duplicate (7500 spots in all) on a Schott nitrocellulose coated glass support slide. (A) shows the array after staining with colloidal gold to visualize total protein. (B) shows the array after immunostaining with anti-FLAG antibody to visualize in each lysate the protein expressed via the exogenous DNA.

FIG. 3 is a schematic diagram of a pCMV6-entry expression vector for expressing proteins in cells via an exogenous DNA. The diagram shows major functionalities of the vector, including the CMV promoter for strong transcription in eukaryotic cells, the SV40 origin for replication in eukaryotic cells, a DDK-myc tag encoding region, regions with multiple restriction sites for cloning, kanamycin/neomycin resistance genes to confer antibiotic resistance in prokaryotes and eukaryotes, respectively, polyadenylation signals for transcript polyadenylation in eukaryotes and an f1 bacterial origin of replication for DNA replication in prokaryotes.

FIG. 4 shows the specificity of binding of a characterized anti-p53 antibody to a protein array of 3720 lysates spotted in duplicate on a Schott nitrocellulose coated glass support slide. (A) shows the array after immunostaining with anti-FLAG antibody to visualize the protein expressed via the exogenous DNA in each lysate. (B) shows the array after immunostaining with the anti-p53 antibody. There is one positive lysate, highlighted by an arrow. No cross reaction was detected. (C) shows an enlarged image of the section of the array containing the spots binding to the p53 antibody. The upper panel in (C) shows anti-FLAG immunostaining in the enlarged area, including serial dilutions of control proteins. The lower panel in (C) shows the reaction of the duplicate spots binding the p53 antibody (highlighted by the arrow).

FIG. 5 illustrates the use of a protein array to decode a monoclonal antibody generated by whole cell immunization. (A) shows immunostaining of the array with the monoclonal antibody. A positive signal is indicated by the dashed box and the arrow. The inset shows the positive area at high magnification, with the duplicate E-Cadherin I positive signal highlighted by an arrow. (B) shows the results of a Western Blot analysis confirming specificity of the monoclonal antibody for E-Cadherin I.

FIG. 6 illustrates the identification of breast cancer biomarkers using a protein lysate array.

The left panel shows the results of immunostaining the array with serum from a breast cancer patient. Positively reacting areas are set off in dashed boxes, highlighted by arrows, and shown enlarged in enlarged areas A, B and C.

The right panel shows control immunostaining with normal control serum. Enlarged areas A, B and C correspond to enlarged areas A, B and C in the left panel. The control serum does not immunostain the positions immunostained by serum from the breast cancer patient; but, reaction of auto-antibodies in the control serum can be seen in C at different positions of the array.

TABLE 2 is a schematic diagram of an embodiment for making arrays using proteins purified from lysates. In embodiments the proteins are tagged with DDK epitopes and are purified by immunoaffinity using anti-DDK antibodies.

FIG. 7 shows homogeneity by SDS-PAGE of ten myc-FLAG (DDK) tagged proteins purified from 10 randomly chosen whole cell lysates by high throughput immunoaffinity purification using an anti-DDK antibody.

TABLE 3 is a schematic diagram of an embodiment for making arrays using in-situ purification of proteins on the support.

FIG. 8 is a schematic diagram of an embodiment for making arrays using a step of on-support purification, in which FLAG-tagged proteins are immobilized on an anti-FLAG coated support and other proteins are washed away, producing an array of purified proteins.

FIG. 9 shows on-support immunoaffinity purification of FLAG tagged proteins from lysates on an anti-FLAG antibody coated nitrocellulose support. (A) shows a small area of an array made on an anti-FLAG antibody coated support. (B) shows a small area of a matching array made without the anti-FLAG antibody coating. In both (A) and (B) the upper insets show immunostaining with anti-myc antibody to visualize myc spotted in this part of the array, and the lower insets show immunostaining with anti-beta actin antibody indicative of (non-specific) binding of proteins that do not comprise the FLAG tag.

TABLE 1

TABLE 2

TABLE 3

V

Embodiments of the invention herein described provide, among other things, protein arrays, methods for making protein arrays, methods for using protein arrays and devices that incorporate protein arrays. Certain embodiments provide methods to determine protein-protein interactions, verify antibody specificity and identify cross-reacting species, decode antibody specificities, identify biomarkers, diagnose disease, and stratify patient populations, among other things.

As further illustrated herein, in embodiments the proteins in the arrays are comprised in cell lysates. In embodiments, the lysates are made from cells in which the proteins are expressed via an exogenous DNA. In embodiments lysates are made from cells that over-express proteins via the exogenous DNA. In embodiments lysates comprising different over-expressed proteins are applied to specific positions in an array, such that the identity of the proteins is known by their positions in the array. In various embodiments the structure and/or the function of over-expressed proteins may or may not be known. In embodiments exogenous DNAs activate over-expression of an endogenous gene. In embodiments exogenous DNAs encode proteins. In embodiments exogenous DNAs comprise cDNAs and engender over production of the proteins the cDNAs encode. In embodiments the cells are mammalian cells. In embodiments the cells are human cells. In embodiments the proteins are mammalian proteins. In embodiments the proteins are human proteins. In embodiments arrays comprise a defined number of genes. In embodiments arrays comprise a defined fraction of the proteins encoded by a genome, such as the human genome.

In embodiments, proteins are purified from the lysates prior to immobilization. In embodiments the proteins are fusion proteins comprising an affinity tag and are purified by immunoaffinity purification and then immobilized in the array. In embodiments the proteins comprise a FLAG affinity tag and are purified by immunoaffinity using an anti-FLAG antibody.

In embodiments, proteins are purified from lysates in situ after application by binding to an affinity reagent coated support. In embodiments proteins are fusion proteins comprising an affinity tag that binds to an affinity reagent and the proteins are specifically bound to a support coated with the affinity reagent, by the interaction between the affinity tag and the affinity reagent, and unbound proteins in the lysate are removed from the support. In embodiments the affinity tag is a FLAG tag and the affinity reagent is an anti-FLAG antibody.

In embodiments protein arrays are used to determine protein-protein interactions. In embodiments arrays are used to identify, to determine and/or to quantify proteins to which a protein binds specifically and/or with which it cross-reacts.

In embodiments protein arrays are used to determine the specificity and/or the cross-reactivity of antibodies, including, among others, polyclonal and monoclonal antibodies, and antibody derivatives. In embodiments arrays are used to decode antibodies, that is, to identify the proteins to which an antibody binds, such as, in particular, when that protein is not known.

In embodiments protein arrays are used to determine the proteins to which antibodies in autoimmune sera bind. In embodiments the autoimmune diseases are any one or more of Lupus, RA or MS.

In embodiments protein arrays are used to identify autoimmune markers of health and/or disease. In embodiments, protein arrays are used to determine autoimmune markers of health and/or disease. In embodiments the autoimmune markers are markers for autoimmune diseases or cancers. In embodiments the autoimmune diseases are any one or more of Lupus, RA or MS.

In embodiments protein arrays are used to determine the binding of non-protein substances to proteins. In embodiments protein arrays are used to identify the protein binding partners of non-protein substances.

Array Making Methods

In embodiments the arrays are formed by, for each protein of a population of proteins, preparing a cell lysate comprising the protein and applying the lysate to a position on a support, wherein the lysate for each protein is applied to a different position, the application of the lysates forms an array of the proteins on the support, and the protein is expressed via an exogenous DNA in the cells from which the lysates are made. Table 1 shows a general scheme for making arrays from cell lysates in accordance with embodiments of the invention. FIG. 1 shows a two dimensional protein array made in accordance with the general method of embodiments set forth in Table 1. Table 2 shows a method for making arrays in accordance with an embodiment wherein proteins are purified from lysates prior to forming the array. Table 3 shows a method for making arrays in accordance with an embodiment wherein proteins are purified after forming the array by an in situ affinity method.

In aspects of the inventions herein described, embodiments relate to arrays that encompass a substantial fraction of the proteins expressed by genes in a genome of an organism. Certain embodiments moreover relate to arrays in which the proteins are over-expressed via an exogenous DNA in a host cell prior to application to the support. Certain further embodiments relate to arrays in which the concentration of proteins in the array is higher than in host cells in which it is expressed, and in embodiments to arrays in which the concentration of proteins in the array is higher than it is in host cells in which it is expressed via an exogenous DNA.

Methods for making arrays in embodiments, as illustrated herein, can employ any suitable methods for expressing proteins in cells via exogenous DNAs (or other polynucleotides), lysing the cells and applying the lysates (or proteins purified therefrom) to a support to form an array. Methods for expressing proteins, making the lysates, and applying the lysates (or purified proteins) to supports to form arrays in accordance with various illustrative embodiments are described in greater detail below, and further illustrated in the Examples.

Proteins for Arrays

Proteins for arrays in embodiments of the invention herein described are obtained via exogenous polynucleotides in host cells, often DNAs. In embodiments the host cells are eukaryotic cells. In embodiments the cells are mammalian. In embodiments they are human cells, as further discussed elsewhere herein. The polynucleotides, such as DNAs for expressing the proteins in the host cells, in embodiments are members of a library. In this regard the term library means a collection or set of polynucleotides, such as DNAs. In embodiments libraries may comprise a defined number of unique loci, genes, protein coding genes or regions, open reading frames or the like of a genome, such as a mammalian genome, such as a mouse, rat, goat, sheep, pig, cow, horse, monkey, gorilla or human genome.

In this regard, a “locus” or “gene locus” refers to a distinct position on a chromosome. A gene locus is precisely mapped by nucleotide sequence to a defined chromosomal region within the genome that includes all possible exons that can be spliced together. More than one transcript can originate from a single genomic locus because of alternative exon usage and/or differential splicing. Thus, each unique gene locus can be represented by multiple expression clones, each containing a polynucleotide for a different transcript originating from the same unique gene locus. By proteins comprising loci as used herein is meant that the proteins correspond to loci, that they are encoded there.

The total number of genes, protein-coding genes, unique loci and the like in the human genome is a matter of on-going research. For instance, see Nature 431, 931-945 (21 Oct. 2004) and other articles in that issue which describe the number of human genes. The NCBI maintains a comprehensive, integrated, non-redundant set of nucleotide sequences from the human genome referred to herein as the Reference Sequence Collection (“RefSeq”). The collection, which is meticulously curated and continually updated, is described in, for instance, Pruitt K D, Katz K S, Sicotte H, Maglott D R, Trends Genet. 2000 January;16(1):44-47; Pruitt K D, Maglott D R, Nucleic Acids Res 2001 January 1;29(1):137-140; The NCBI handbook [Internet]. Bethesda (Md.): National Library of Medicine (US), National Center for Biotechnology Information, 2002 Oct. Chapter 17, The Reference Sequence (RefSeq) Project (available via http://ncbi.nlm.nih.gov/entrez). Sequences of polynucleotides for expressing proteins, as well as numbers of loci and genes, can be located in RefSeq and in other databases such as GenBank, SwissProt, GenSeq, EMBL, UniProt, ASD, IMGT, IPD, IPI.

In embodiments such libraries comprise a substantial portion of the unique loci in the genome, such as at least any of 5, 10, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 97, 99% (and values in between) of the unique loci in a genome, such as the human genome. In embodiments such libraries may comprise a substantial portion of genes in the genome, such as at least any of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 97, 99% or more (and values in between) of the genes in a genome, such as the human genome. In embodiments such libraries may comprise a substantial portion of protein-coding genes in the genome, such as at least any of 5, 10, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 97, 99% or more (and values in between) of the protein-coding genes in a genome, such as the human genome.
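By way of illustration only, the fraction of unique loci represented in a library can be computed as a set overlap. In the following Python sketch the function names and the toy locus identifiers are hypothetical; the threshold list is the one recited above for unique loci.

```python
# Percent thresholds recited above for unique loci.
THRESHOLDS = (5, 10, 20, 25, 30, 35, 40, 45, 50, 55,
              60, 65, 70, 75, 80, 85, 90, 95, 97, 99)


def coverage_percent(library_loci, genome_loci):
    """Percent of the genome's unique loci represented in the library."""
    genome = set(genome_loci)
    if not genome:
        raise ValueError("the genome must contain at least one locus")
    # Several clones may represent one locus; a set counts each locus once.
    covered = set(library_loci) & genome
    return 100.0 * len(covered) / len(genome)


def highest_threshold_met(percent):
    """Largest recited threshold that is met, or None if below all of them."""
    met = [t for t in THRESHOLDS if percent >= t]
    return max(met) if met else None


# Toy data: 200 loci in the genome, 104 of them represented in the library.
genome = [f"locus{i}" for i in range(200)]
library = genome[:104] + ["locus0"]  # second clone for an already counted locus
pct = coverage_percent(library, genome)
print(pct, highest_threshold_met(pct))
```

Counting loci rather than clones matters because, as noted above, a single locus can be represented by multiple expression clones for different transcripts.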

In embodiments such libraries comprise polynucleotides, such as DNAs, for a specified number of unique loci, genes, protein coding genes, open reading frames or the like, such as at least any of 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 24,000, 26,000, 28,000, 30,000, 35,000, 40,000 loci, genes, protein coding genes, open reading frames and the like.

Subarrays

In embodiments protein arrays are organized into subarrays based on particular, general protein features, such as functional or structural features, when a protein is expressed during the cell cycle or during development, or where it is expressed in an organism, or where it is located in cells, its involvement in a given metabolic pathway, its relationship to a disease, etc. In embodiments arrays and/or subarrays may group proteins that are related by involvement in a specific disease. In embodiments arrays and/or subarrays may group proteins such as transmembrane (plasma membrane); G-protein coupled receptors; G-protein coupled receptors, non-olfactory; G-protein coupled receptors, olfactory; hormone receptors; steroid hormone receptors; neurotransmitter receptors; enzymes; kinases; cytoplasmic; organellar; nuclear; nuclear membrane; endoplasmic reticulum; mitochondrial; lysosomal; cytoskeleton; immune system; tissue type (e.g., breast, prostate, brain, heart, etc.); ion channels; nuclear hormone receptors; cytochrome P450; phosphatases; proteases; phosphodiesterases; protein trafficking; ATP-binding cassette (ABC); cytokines; homeobox and HOX genes; integrins; transporters; DexH/D protein family (RNA metabolism), etc.

Protein Production via Exogenous DNAs

In embodiments proteins are expressed in cells from which the lysates are made via an exogenous DNA, which is to say that the amount of the protein in the cells, and thereby in the lysates, is engendered substantially by the exogenous DNA. In embodiments the protein is over-expressed in the cells via the exogenous DNA, by which is meant that the protein is produced in the cells in excess of the amount the cells would produce were it not for the presence and action of the exogenous DNA. In embodiments the protein is produced endogenously in cells; but, it is produced at distinguishably higher levels in the cells via the exogenous DNA. In embodiments the protein is over-produced via the exogenous DNA in amounts that substantially exceed the amount produced in its absence. In embodiments the protein is produced in amounts that are at least any of 1.2, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0, 9.0, 10, 25, 50, 75, 100, 150, 200, 250, 300, 400, 500, 750, 1,000, 2,000, 3,000, 5,000, 7,500, 10,000 or more times as much as the amounts produced in the cells without the exogenous DNA.
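By way of illustration only, the fold over-expression recited above is the ratio of the amount of protein produced with the exogenous DNA to the amount produced without it. In the following Python sketch the function names and the example amounts are hypothetical, and both amounts are assumed to be measured in the same units.

```python
def fold_overexpression(amount_with_dna, amount_without_dna):
    """Ratio of protein produced with the exogenous DNA to that without it."""
    if amount_without_dna <= 0:
        # A protein not detectably produced without the exogenous DNA
        # has no finite fold value; it is handled separately here.
        raise ValueError("baseline amount must be positive to compute a fold")
    return amount_with_dna / amount_without_dna


def meets_fold(amount_with_dna, amount_without_dna, minimum_fold):
    """True if production is at least minimum_fold times the baseline."""
    return fold_overexpression(amount_with_dna, amount_without_dna) >= minimum_fold


# Made-up amounts in arbitrary but matching units.
fold = fold_overexpression(50.0, 2.0)
print(fold, meets_fold(50.0, 2.0, 25), meets_fold(50.0, 2.0, 50))
```

In this example the protein meets the 25-fold level but not the 50-fold level.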

Exogenous DNAs for Expressing Proteins

By exogenous DNA is meant a DNA that is not a naturally occurring DNA in its natural setting in a genome; that is, that it is not an unaltered endogenous gene in its unaltered endogenous setting. Typically, exogenous DNAs are DNAs introduced into cells via well known recombinant DNA techniques. Often the exogenous DNA encodes the protein to be expressed, either in its natural form, as a mutein and/or as a fusion protein. In embodiments in this regard exogenous DNAs are introduced into cells in the form of expression vectors or constructs, as discussed in greater detail below. Exogenous DNA also may be an activator DNA that does not encode the protein to be expressed, such as a RAGE construct that acts by non-homologous recombination. It may be an activator DNA that comprises only a portion of the gene for the protein to be expressed, such as a construct for gene activation by homologous recombination. It may be a construct that encodes the protein, such as an expression construct, in which case the coding region may be uninterrupted or interrupted. And it may be other exogenous DNA that engenders the production in the cells of desired amounts of one or more proteins for an array.

Expression Vectors

In embodiments a protein for an array is expressed in cells via an exogenous DNA that is an expression vector (also referred to as an expression construct) that encodes the protein and in which the coding sequence for the protein is operably linked to expression control sequences (also referred to as cis-acting control sequences) that provide for the desired transcription and, ultimately, production of the protein in a host cell. In embodiments expression vectors replicate autonomously, such as those that persist as episomal elements in cells. In embodiments expression vectors integrate into host cell DNA, such as those that replicate with the host cell DNA. Any suitable expression control sequences can be used to produce proteins in cells. Such expression control sequences include but are not limited to promoters, enhancers, ribosome interaction sites, such as ribosome binding sites, polyadenylation sites, transcription splice sequences, transcription termination sequences, sequences that stabilize mRNA, and other sequences that engender, regulate, facilitate, increase and/or achieve a desired effect on production of proteins via exogenous DNA in a host cell. Such control sequences can be selected for host compatibility, inducible expression, high mRNA copy number, and other desirable effects. In embodiments promoters useful in this regard include trp, lac, tac, or T7 promoters for bacterial hosts; alpha factor, alcohol oxidase, or PGH promoters for yeast; and MMTV, SV40, CMV, and RSV promoters for eukaryotic cells, such as mammalian cells.

An illustrative expression vector useful in certain embodiments of the invention, pCMV6-entry, is shown schematically in FIG. 3.

Introduction of Exogenous DNA into Cells

Any suitable system or method can be used to introduce DNAs or other polynucleotides into cells for expression of proteins for making arrays in embodiments of the invention. There are many well known methods for introducing DNAs and other polynucleotides that can be used in this regard, such as those described in the references listed further below. Among suitable methods described therein and elsewhere that can be used in embodiments of the invention herein described are calcium phosphate precipitation, electroporation, injection, DEAE-dextran-mediated transfection, fusion with liposomes, association with agents which enhance its uptake into cells, and viral transduction. Methods can be used for introducing DNAs and other polynucleotides into cells in which, after entry into the cell, the DNA (or other polynucleotide) persists extra-chromosomally or integrated into a chromosome(s) of the host cell. The DNA, or other polynucleotide, can be transiently, constitutively and/or inducibly expressed, in accordance with well known methods. Where the polynucleotide introduced into the cell is not DNA, it will often be the case that it is copied into DNA, which DNA ultimately is the template for expressing the protein of interest.

Cells

As noted above, cells that express proteins of interest can be made by introducing exogenous DNAs (or other polynucleotides) into host cells, selecting the cells that have taken up the DNA, clonally propagating the cells, confirming that they express the protein of interest, and then storing and/or further expanding the cells to produce a sufficient population of cells to make lysates sufficient for making desired arrays. Suitable methods are well known and routine in the art, such as for instance the methods set forth in the references on molecular cloning listed further below.

In embodiments proteins for making arrays can be made in any suitable cell type, such as, without limitation, prokaryotic cells or eukaryotic cells, including bacterial, plant or animal cells, yeast or mammalian cells, and human cells, such as COS, CV1, BHK, CHO, HeLa, LTK, NIH 3T3, 293, and HEK293 cells, such as HEK293T cells.

Lysates

Lysates can be made from cells using any suitable method. In embodiments the lysate methods preserve desired structural and/or functional features of the proteins. Many such methods are well known to those of skill. For instance, lysates can be made using detergentless buffers and buffers with detergents, such as RIPA buffer, lysis buffer containing SDS, hypotonic lysis buffer and the like. Methods for lysing cells are well known in the art and include but are not limited to detergent lysis, sonication lysis, and lysis under pressure (French Press) and the like.

In embodiments the concentration of proteins in each lysate in at least 20, 30, 40, 50, 60, 70, 80, 90, 95 or 100% of the lysates in the array is at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 micrograms/ml, 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 mg/ml, or 1, 2, 3, 4, or 5 gm/ml.

In embodiments the concentration of proteins expressed via the recombinant DNAs in at least 20, 30, 40, 50, 60, 70, 80, 90, 95 or 100% of the lysates in the array is at least any of 0.01, 0.05, 0.10, 0.20, 0.50, 0.75, 1.00, 2.00, 3.00, 4.00, 5.00, 10.0, 15.0, 20.0 percent of the total protein in the lysate.

In embodiments, lysate concentrations are 0.2-4 mg/ml and the protein expressed via an exogenous DNA is between 0.1 and 2% of the total protein.
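By way of illustration only, the two ranges just stated can be combined to estimate the concentration of the protein expressed via the exogenous DNA in a lysate. The following Python sketch uses a hypothetical helper name and assumes that the percentage is by mass of total protein.

```python
def recombinant_ug_per_ml(total_mg_per_ml, percent_of_total):
    """Concentration of the expressed protein in micrograms/ml.

    total_mg_per_ml: total protein concentration of the lysate, in mg/ml.
    percent_of_total: expressed protein as a percent of total protein
    (assumed here to be a mass percentage).
    """
    total_ug_per_ml = total_mg_per_ml * 1000.0  # 1 mg = 1000 micrograms
    return total_ug_per_ml * percent_of_total / 100.0


# Extremes of the ranges stated above: 0.2-4 mg/ml and 0.1-2%.
low = recombinant_ug_per_ml(0.2, 0.1)
high = recombinant_ug_per_ml(4.0, 2.0)
print(f"{low:.2f} to {high:.2f} micrograms/ml")
```

Under these assumptions the expressed protein ranges from about 0.2 micrograms/ml at the low end to about 80 micrograms/ml at the high end.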

In embodiments wherein recombinant proteins are purified before application to the array, the concentration of the recombinant protein in the application buffer for at least 20, 30, 40, 50, 60, 70, 80, 90, 95 or 100% of the proteins applied to the array is at least any of 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 micrograms/ml, 1, 2, 3, 5, 10, 15, 20, 25, 35, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800 or 900 mg/ml, or 1, 2, 3, 4, or 5 gm/ml.

Supports

Arrays can be made on any suitable support, whether in one part or several. In embodiments the solid support can be any material that is an insoluble matrix and can have a rigid or semi-rigid surface. In embodiments the support is a membrane, such as nitrocellulose, nylon, and the like, among other membrane materials suitable to act as supports for applying proteins to make arrays. Such membrane supports may be free standing or may be themselves supported, such as nitrocellulose membrane material on a glass slide. In embodiments supports may be glass, such as glass slides, silicon, including the surfaces of elements in integrated circuits and MEMS devices, and plastics, including plastic plates, such as microtiter plates, including, for instance, 96-well and 384-well microtiter plates, as well as those of other capacities.

Exemplary solid supports include, but are not limited to, substrates such as nitrocellulose (e.g., in membrane on a glass slide or in microtiter well form); polyvinylchloride (e.g., sheets, on glass or in microtiter wells); polystyrene latex (e.g., bead, on glass or in microtiter plates); polyvinylidene fluoride; diazotized paper; nylon membranes; activated beads, magnetically responsive beads, etc. Particular supports include plates, pellets, disks, capillaries, hollow fibers, needles, pins, solid fibers, cellulose beads, pore-glass beads, silica gels, polystyrene beads optionally cross-linked with divinylbenzene, grafted co-poly beads, polyacrylamide beads, latex beads, dimethylacrylamide beads optionally crosslinked with N,N′-bis-acryloylethylenediamine, and glass particles coated with a hydrophobic polymer.

In embodiments proteins are attached to supports via covalent and/ornon-covalent bonding. In embodiments proteins can be attached inunmodified form or they can be modified to facilitate attachment orremoval after attachment or both. In embodiments proteins can bemodified to facilitate or enable attachment to glass, polylysine,polystyrene, polyacrylate, polyimide, polyacrylamide, polyethylene,polyvinyl, polydiacetylene, polyphenylene-vinylene, polypeptide,polysaccharide, polysulfone, polypyrrole, polyimidazole, polythiophene,polyether, epoxies, silica glass, silica gel, siloxane, polyphosphate,hydrogel, agarose, cellulose, and/or other supports, coatings or films.

Among these are glass slides coated with nitrocellulose, such as Schott Nexterion nitrocellulose slides.

Application of Lysates to Supports

Proteins, such as those in lysates expressed via an exogenous gene, can be applied to arrays in a variety of ways. In embodiments they are applied using a microarray printer. Microarray printers can be differentiated into three groups by their printing tip architecture and mechanisms for spotting samples: quill pins (split pins), piezoelectric (ink jet) spotters, and solid pins (Barbulovic-Nad et al., 2006). The solid pin arrayer developed by Aushon is specially designed for printing complex mixtures, such as cell lysate, and it works well with viscous protein solutions to produce uniform spots on slides (Spurrier et al., 2008). Uniformity with this spotter is very good, as illustrated in FIG. 2A, which shows an array stained with colloidal gold. Total protein in the spots is uniform across the array, and the concentration series in each sub-array shows appropriate scaling, which likewise is uniform across the array.

Array Geometry and Spot Density

Arrays can be made in a wide variety of formats, sizes and modularity, and can be made with a wide variety of positions, proteins, features, feature sizes, feature spaces, feature occupancy, controls, alignment markers and references, among others.

In embodiments there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 positions per cm² in the arrays.

In embodiments there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 lysates applied to different positions on the array per cm². In embodiments there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 proteins expressed via an exogenous DNA applied to different positions on the array per cm².

In embodiments there are at least any of 10, 25, 50, 75, 100, 150, 200, 250, 350, 500, 750, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,500, 10,000, 15,000, or 20,000 positions per cm² on the arrays with said proteins applied to at least any of 50, 60, 70, 80, 90 or 95% of said positions.

In embodiments the lysates and/or proteins are applied in spots (features) that are any of 10-50, 25-75, 50-100, 75-150, 100-200, 150-250, 200-300, 250-350, 300-400, 400-500, 500-750, 400-800, or 750-1,000 μm in diameter.

In embodiments the area of the features comprising lysates or proteins in the array is any of 10-50, 25-75, 50-100, 75-150, 100-200, 150-250, 200-300, 250-350, 300-400, 400-500, 500-750, 400-800, 750-1,250, 1,000-2,000, 1,500-3,000, or 2,500-5,000 μm².

In embodiments the center to center spacing of the features (spots) in the array is any of 5-15, 10-20, 15-25, 20-40, 25-50, 25-75, 50-100, 75-150, 100-150, 125-175, 150-225, 200-250, 225-275, 250-350, 300-400 or 400-500 μm.

In embodiments the protein spot size is 110 to 300 μm in diameter. In embodiments center to center spacing of positions and/or proteins on the array is 150-250 μm.
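As a rough check on how center to center spacing relates to the position densities recited above, the following minimal Python sketch assumes a regular square grid of features with no margins or gaps between subarrays. That assumption is made here for illustration only; real layouts include margins and subarray gaps, so achieved densities will be somewhat lower. The function name positions_per_cm2 is hypothetical.

```python
def positions_per_cm2(pitch_um: float) -> float:
    """Features per cm^2 on a square grid with the given center to center pitch (in um)."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    features_per_cm = 10_000.0 / pitch_um  # 1 cm = 10,000 um
    return features_per_cm ** 2


# Spacings within the 150-250 um range mentioned in the text.
for pitch in (150, 200, 250):
    print(f"{pitch} um pitch -> about {positions_per_cm2(pitch):.0f} positions per cm^2")
```

Under this idealized assumption, a 200 μm spacing corresponds to 2,500 positions per cm².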

Using Arrays to Detect Binding

Any suitable methodology can be used for detecting and/or measuring binding of agents to proteins in protein arrays. In embodiments solid phase assays are used. In embodiments sandwich assays are used. In embodiments radiometric, colorimetric, chemiluminescence and/or fluorimetric based assays are used. In certain embodiments relating to antibody binding assays, for instance, any suitable immunoassay can be used, including for instance RIAs (radioimmunoassays), ELISAs (enzyme-linked immunosorbent assays), EIAs (enzyme immunoassays), immunofluorescence assays, immunoprecipitation assays, and the like. In embodiments direct labeling methods are used, in which agents are directly labeled and binding to proteins in the array is determined by detecting and/or measuring the directly bound label. In embodiments indirect labeling methods are used, in which binding of agents is detected by interaction with a detection moiety that is not part of the agent and not part of the protein on the array. For instance, in indirect ELISAs binding of an antibody to a protein in the array is detected by a labeled secondary antibody that binds to the first antibody. Colorimetric, radiometric and fluorimetric detectable markers that are useful in embodiments include but are not limited to rhodamine or a rhodamine derivative, biotin, avidin, streptavidin, a fluorescent compound, such as Cy3, Cy5, Alexa-555, Alexa-647, DyLight-549 or DyLight-649, a chemiluminescent compound, such as dimethyl acridinium ester, and the like.

In embodiments enzyme immunoassays can be used for detection. A variety of such assays are well known and routinely employed in the art and readily can be applied to protein arrays, such as those described in, for example, Voller, A., “The Enzyme Linked Immunosorbent Assay (ELISA),” 1978, Diagnostic Horizons 2, 1-7, Microbiological Associates Quarterly Publication, Walkersville, Md.; Voller, A. et al., 1978, J. Clin. Pathol. 31, 507-520; Butler, J. E., 1981, Meth. Enzymol. 73, 482-523; and Maggio, E. (ed.), 1980, Enzyme Immunoassay, CRC Press, Boca Raton, Fla.

ELISAs utilize enzymatic reactions to produce colored (absorbent) or fluorescent products from chromogenic or fluorogenic substrates, or luminescence from chemiluminescent substrates. One or more enzymes may be employed. In a simple implementation an enzyme that acts on a chromogenic substrate is conjugated to an antibody. The conjugate is incubated with, for instance, proteins immobilized in a microtiter dish well. After incubation, conjugate that has not bound to the proteins is washed away. A signal-generating reagent, such as a chromogenic substrate, is then added and incubated in the wells for a period of time to allow any enzyme conjugate bound to the protein in the microtiter plate well to generate the colored product. In the linear regime of the reaction, the amount of color produced is proportional to the amount of bound conjugate. For protein arrays, the product of the reaction generally will either precipitate onto or bind the surface, so that it does not diffuse away from locations where antibody is bound. The use of an enzymatic reaction greatly amplifies the signal from each binding event. ELISAs can employ two or more enzymes for additional amplification. A very wide variety of ELISAs are known to the art and readily can be adapted to use with protein arrays as herein described.

Many enzymes have been used successfully in ELISAs that can be employed in embodiments, including but not limited to: malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, α-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, β-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase.

Any suitable substrate and label for these enzymes (and others) can be used in ELISAs, such as but not limited to (as mentioned above) chromogenic, fluorogenic, bioluminogenic and chemiluminogenic substrates, which give rise, respectively, to colored, fluorescent, bioluminescent and chemiluminescent products. Radiolabeling also can be used. Fluorescent labels useful in embodiments include but are not limited to the following: fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine. Chemiluminescent labels useful in embodiments include but are not limited to luminol, isoluminol, aromatic acridinium ester, imidazole, acridinium salt and oxalate ester. Bioluminescent labels useful in embodiments include but are not limited to luciferin, luciferase and aequorin. Any suitable label can be employed, and the foregoing are merely some of the better known and more effective labels that have been developed and employed in ELISAs and other binding assays that can be effective in this regard.

Using Protein Arrays to Screen for Binding Partners

The interaction of any type of sample or substance (referred to herein as agents) that reacts with or binds to a polypeptide (protein) in an array can be identified. In embodiments arrays are used to detect binding to proteins in arrays, such as the binding of samples (and components thereof) or agents, such as candidate binding compounds or antibodies. Illustrative uses in this regard are described below.

In embodiments the agents are specific-binding partners of proteins in an array, such as antibodies, receptor ligands, aptamers, polypeptides, and other binding molecules. Agents can be enzymes or other substances that modify polypeptides, such as kinases (which phosphorylate proteins). Agents can comprise any substance or moiety that can bind a protein (polypeptide) including, but not limited to, chemical compounds; biomolecules, such as polypeptides (amino acids), lipids, nucleic acids (nucleotides and polynucleotides), and carbohydrates; inorganic molecules; organic molecules and the like, alone or combined.

Using Protein Arrays to Identify Proteins to Which Antibodies Bind

In embodiments protein arrays as described herein can be used to screen for and identify the proteins to which antibodies bind. Antibodies, because of their specificity, are widely employed for diagnostic and therapeutic purposes. Because of limitations in current assay methods, the antigens with which these antibodies interact are not entirely known.

When antibodies are generated against antigens, the resulting antibodies are generally characterized as being specific for that antigen. In the case of protein antigens, the proteins typically comprise one or more epitopes to which individual antibodies bind specifically. Epitopes in proteins may be formed not only by continuous portions of the protein but also by discontinuous regions of the protein that are folded into proximity in the protein's three dimensional conformation. The binding specificity of an antibody, such as to a protein, can be complicated by cross-reaction to other proteins that contain the same or similar epitopes. Cross-reactivity can be a significant problem when antibodies are used for both analytical and therapeutic purposes.

Sometimes such cross-reaction arises because an amino acid sequence that defines an epitope occurs in different proteins. This can occur in differentially-spliced variants of the same primary transcript or because, simply, the sequence occurs in two proteins independently. Cross-reacting proteins may occur in the same cell and tissue types, and in different types, such as occurs when splice variation occurs in a tissue-specific manner.

In general, it is important to understand the specificity of an antibody for use in a detection assay, such as a diagnostic assay, or in a therapeutic. For many purposes it is important to characterize the antibody's interaction with its specific target and its cross-reactivity. Global understanding of antibody cross reactivity has not been readily obtainable using current technology, because there has not been any way to determine the interaction of an antibody with the proteome as a whole. Much the same is true for the interaction of other proteins (and non-protein agents) that bind to protein partners.

In embodiments herein described, protein arrays can be used to screen antibodies for cross-reaction to proteins other than the primary antigen and, in particular, to gain an understanding of interactions with a substantial fraction of the proteins encoded in a given genome. In embodiments in this regard genome wide arrays as described herein are particularly useful. Much the same is true for other types of agents that bind proteins.

Using Protein Arrays to Identify Biomarkers of Disease

In certain embodiments protein arrays described herein can be used to detect antibodies produced in autoimmune diseases and in other diseases, such as cancers. In autoimmune conditions, subjects generate an immune response against self-antigens, and these antibodies often are useful markers for disease diagnosis and prognosis. Sometimes the cognate antigen for such auto-antibodies is not known, in which case, in embodiments, protein arrays as described herein can be used to determine the proteins to which such protein-binding auto-antibodies bind. In other cases, cognate antigens are known, at least in part, and protein arrays as described herein can be used to determine, characterize and/or measure the auto-antibodies.

For antibodies that bind known or unknown proteins, in embodiments protein arrays as described herein can be used to determine the absence, the presence and/or the amount of such auto-antibodies. For instance, in accordance with certain embodiments of the invention in this regard, antisera, blood components, fluids, and/or cells (to name a few) from subjects, such as those at risk for or actually suffering from autoimmune conditions, can be applied to protein arrays as herein described, to determine the target antigen (proteins) to which they bind, and/or to determine the absence, presence and/or the amount of auto-antibodies in the samples that bind to particular proteins in the array, so as to characterize the auto-immune antibodies in the sample and thereby diagnose health, risk or actual disease or the like in the subject.

Similarly, in accordance with certain embodiments of the invention in this regard, antisera, blood components, fluids, and/or cells (to name a few) from subjects, such as those at risk for or actually suffering from diseases that engender the production of antibodies not generally found in the absence of the disease, can be applied to protein arrays as herein described, to determine the target antigen (proteins) to which they bind, and/or to determine the absence, presence and/or the amount of auto-antibodies in the samples that bind to particular proteins in the array, so as to characterize the antibodies in the sample and thereby diagnose health, risk or actual disease or the like in the subject.

The foregoing description and the examples below illustrate various embodiments of the inventions herein disclosed. It is to be appreciated that a wide variety of additional aspects, features and embodiments will be apparent from reading the disclosure to those skilled in the arts pertaining thereto, which are all within the scope of the inventions herein disclosed.

V

The following examples describe illustrative embodiments of protein microarrays in accordance with various aspects of inventions herein described and a few illustrative applications of them. These examples are in no way limitative of the inventions herein described.

Example 1 Protein Microarray with Approximately 50% Coverage of Human Protein Genes

Embodiments of the invention provide arrays with substantial fractions of all of the protein-coding genes in a genome. The International Human Genome Sequencing Consortium estimates that there are about 20,000-25,000 protein-coding genes in the human genome (Stein, 2004). The following examples illustrate the production and use of protein arrays with approximately 10,000-20,000 spots representing approximately 3,500-10,000 individual human genes. The arrays were produced using OriGene, Inc. libraries of validated human cDNAs cloned into the mammalian expression vector pCMV6-entry as described below.

Example 2 pCMV-entry Vector for Expressing the Human Proteins

Proteins for the arrays were expressed via a pCMV-entry expression vector, schematically depicted in FIG. 3. The vector has several features that make it especially effective for overexpressing mammalian proteins in mammalian host cells for making protein arrays. It comprises an origin of replication effective for efficient episomal replication in eukaryotic cells (SV40 ori), and an origin for replication in bacteria. It comprises an expression cassette for convenient cloning and efficient expression in mammalian cells, comprising a CMV promoter, multicloning regions, and a polyadenylation signal. The expression cassette also includes myc and DDK epitopes just upstream of the polyadenylation signal for expressing C-terminal myc-DDK tagged proteins. The tags are effective and convenient moieties for detecting and purifying the recombinantly expressed proteins. The vector comprises a T7 promoter upstream of the multicloning regions for efficient transcription in bacterial hosts (and in vitro). It comprises a second expression cassette for expressing drug resistance markers for selection in mammalian and bacterial host cells (neomycin and kanamycin resistance genes, respectively). And it comprises C-terminal myc and DDK tag sequences in the 3′ region of the CMV expression cassette for expression of tagged fusion proteins. It comprises an origin of replication for propagation in bacterial cells as well. (FLAG is a proprietary name for DDK.)

Example 3 Over-Expression Lysates

More than 12,000 over-expression lysates have been made using human cDNAs cloned into the pCMV-entry vector and over-expressed in HEK293T cells, and they have been validated by anti-Flag immunoblot analyses. Expression profile analyses also showed that proper posttranslational modifications occur in this expression system. The expression profiles for all the lysates were examined and annotated individually. FIG. 2 shows expression profiles for 8 randomly chosen lysates.

Expression levels for most of the recombinant proteins are at least 100 times higher than those of their endogenous counterparts, as illustrated in FIG. 2. This level of over-expression provides an extremely high signal to noise ratio relative to the background from the host cells themselves.

The overall success rate for anti-Flag immunoblot is around 95%.

Example 4 Printing Over-expression Lysates on Nitrocellulose Slides

Overexpression lysates were printed on a Schott nitrocellulose slide. FIG. 1 shows an overall layout and subarray specifications. As shown in the figure, each slide was divided into subarrays, typically 40 subarrays per slide. The enlarged area in the figure shows the layout of each subarray. As indicated in the figure, each subarray contained the following controls and markers: purified BSA-Cy3 and BSA-Cy5 orientation markers; purified mouse and rabbit IgG (positive controls); lysates of HEK293T cells transfected with empty pCMV-entry vector DNA (negative control); and a reference dilution series of purified GST-myc-FLAG fusion proteins (to establish reference concentration curves for quantifying exogenous recombinant protein expression in the subarray lysates). The signal from the GST-myc-FLAG concentration series served to establish a standard curve of signal intensity vs. concentration for determining myc-FLAG tagged protein expression.
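The use of a reference dilution series as a standard curve can be sketched as follows. This is a minimal illustration, assuming that signal is approximately linear in concentration over the range of the standards; the source does not state the curve model that was used, and the numerical values and the function names fit_line and concentration_from_signal are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept, signal_range):
    """Invert the standard curve; refuse to extrapolate beyond the standards."""
    low, high = signal_range
    if not low <= signal <= high:
        raise ValueError("signal outside the range of the reference series")
    if slope == 0:
        raise ValueError("flat standard curve cannot be inverted")
    return (signal - intercept) / slope


# Hypothetical reference dilution series in arbitrary units (not data from the source).
ref_conc = [1.0, 2.0, 4.0, 8.0]
ref_signal = [110.0, 210.0, 410.0, 810.0]

slope, intercept = fit_line(ref_conc, ref_signal)
estimate = concentration_from_signal(
    510.0, slope, intercept, (min(ref_signal), max(ref_signal))
)
print(f"slope={slope:.1f} intercept={intercept:.1f} estimate={estimate:.2f}")
```

Restricting estimates to the signal range spanned by the standards is a design choice of this sketch; spots that fall outside that range would need a re-scan or a wider dilution series rather than extrapolation.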

Arrays were made on Schott nitrocellulose standard microarray slides, using pin spotters, in particular an Aushon 2470 array spotter. 9,000 spots (features), 200-300 picoliters each, were printed on 21 mm×51 mm nitrocellulose pads on the standard slides using 110 μm pins. 16,000 spots were printed with 85 μm pins. As many as 22,000 spots of 150-200 picoliters each were printed on the slides using an 85 μm pin on a somewhat larger nitrocellulose pad (21 mm×60 mm). In keeping with ambient analyte theory, detection sensitivity increased with decreasing spot size (Ekins, 1989). Uniformity of signal was greater for 85 μm spots than for 110 μm spots, while signal intensity was about the same. Details of array fabrication are set out below.

-   Slide type: Schott NC slide
-   Aushon arrayer pin size: 85 μm
-   Slide NC pad dimension: 21 mm×60 mm
-   Total subarrays: 48 (4 columns×12 rows)
-   Subarray Size: 4200 μm×4400 μm
-   Subarray Dimensions: 21 columns (Horizontal)×22 rows (Vertical)
-   Median Spot Diameter: ˜150 μm
-   Spot Center to Center Spacing: 200 μm
-   Distances between Subarrays: 200 μm
-   Replicates per Sample: 2

Slides made to these specifications can comprise 22,176 features (spots), such as duplicates of 10,464 unique protein spots (such as lysates) and 1,248 control features.
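The feature count follows directly from the layout parameters listed above, and the arithmetic can be checked with a short Python sketch. The footprint calculation at the end rests on an assumption that is not stated in the source: that the 200 μm distance between subarrays is added between adjacent subarrays only, with no margin at the pad edges. All variable names are hypothetical.

```python
# Layout parameters taken from the fabrication details listed above.
SUBARRAY_GRID = (4, 12)    # subarray columns x rows per slide
SPOT_GRID = (21, 22)       # spot columns x rows per subarray
PITCH_UM = 200             # spot center to center spacing
GAP_UM = 200               # distance between subarrays
PAD_MM = (21, 60)          # nitrocellulose pad width x length
REPLICATES = 2
UNIQUE_SAMPLES = 10_464
CONTROL_FEATURES = 1_248

subarrays = SUBARRAY_GRID[0] * SUBARRAY_GRID[1]
spots_per_subarray = SPOT_GRID[0] * SPOT_GRID[1]
total_features = subarrays * spots_per_subarray
sample_features = UNIQUE_SAMPLES * REPLICATES

# Subarray size implied by the spot grid and the spacing.
subarray_size_um = (SPOT_GRID[0] * PITCH_UM, SPOT_GRID[1] * PITCH_UM)

# Assumption: gaps occur only between adjacent subarrays, none at the pad edges.
footprint_mm = tuple(
    (n * size + (n - 1) * GAP_UM) / 1000
    for n, size in zip(SUBARRAY_GRID, subarray_size_um)
)
fits_on_pad = all(f <= p for f, p in zip(footprint_mm, PAD_MM))

print(f"{subarrays} subarrays x {spots_per_subarray} spots = {total_features} features")
print(f"{sample_features} sample features + {CONTROL_FEATURES} controls")
print(f"subarray size {subarray_size_um} um, footprint {footprint_mm} mm, fits: {fits_on_pad}")
```

The totals agree with the text: 48 subarrays of 462 spots give 22,176 features, which equals 20,928 duplicate sample features plus 1,248 controls, and a 21×22 grid at 200 μm spacing gives the stated 4200 μm×4400 μm subarray size.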

Lysates in RIPA buffer containing 1% NP-40 were spotted onto the nitrocellulose slides using a solid pin spotter. Other solutions can be used for printing, such as those that contain other detergents or chaotropic reagents, such as those described in Chan et al., 2004 and Nishizuka et al., 2003. Lysates were spotted directly from source plates and all spotting was carried out in a controlled environment at 70% relative humidity to minimize evaporative effects on sample concentrations. This worked well for printing arrays of approximately 9,000 spots.

For larger arrays, the fabrication time can be kept the same by spotting from several source plates, instead of one, serially or in parallel. To further minimize evaporative effects on concentration during array fabrication, the lysates can be distributed into several source plates and spotted in parallel, so that none of the samples is exposed so long that it is adversely affected. Alternatively or additionally, 4% glycerol (or other stabilizing and anti-evaporative agents) can be added to the spotting buffer.

Example 5 Evaluation of Chip Quality

Quality of the protein arrays was evaluated by staining with colloidal gold to evaluate total protein and by anti-Flag immunostaining to evaluate recombinant protein. Colloidal gold staining shows high uniformity of spot morphology for total protein across the arrays, illustrated in FIG. 3A. Immunostaining with anti-FLAG antibody showed binding of FLAG-myc fusion proteins across the array, seen in FIGS. 3B and 5A. Variability in amount of anti-FLAG binding reflects differences in fusion protein expression in the host cells and consequent differences in concentrations in the lysates. Purified GST-myc-Flag fusion protein was applied directly to the arrays to serve as a reference standard for determining concentrations of fusion proteins in lysates. The dilution series is graphically illustrated in the enlarged area of the array in FIG. 1, and uniformity of the dilution series can be seen in the upper right panel in FIG. 4C.

Example 6 Identifying Antibody Specificities Using Lysate Arrays

Antibody specificity often is critical for molecular biology research and therapeutic antibody development, as well as a crucial feature for the diagnostic and therapeutic use of antibodies. Cross-reactivities can cause false positives in biological research and side effects in therapeutic antibody treatment, as described in Tabrizi et al. (2009), for instance. Embodiments of inventions herein disclosed include the use of overexpression lysate microarrays for identifying and validating antibody specificity, including identifying and/or characterizing primary specificities and cross-reactivities of antibodies. The arrays furthermore can be used to investigate, identify and validate the binding specificities and cross-reactivities of other proteins, as well as other types of molecules.

By way of illustration in this regard, a previously characterized polyclonal antibody for p53 was screened against a lysate array comprising over 3700 overexpression lysates comprising 3700 distinct myc-FLAG human fusion proteins. The p53 antibody reacted both with the p53 expressing lysate and with endogenous p53 of the HEK293T host cells. The signal from the overexpression lysate was more than 10 times greater than the signal from background HEK293T expression.

In addition, the antibody bound to several other lysates at levels notably above background, but less than the binding to the p53 lysate. Further examination showed that the higher p53 signals in several of these lysates were due to stimulation of p53 expression by the exogenous protein rather than cross-reactivity of the exogenous protein to the anti-p53 antibody. Similar results were obtained using a mouse monoclonal anti-p53 antibody.

The results, illustrated in FIG. 4, show that over-expression lysate arrays can be used to study protein interaction specificity and cross-reactivity and to determine effects of overexpression of a large number of proteins (individually and/or in concert with one another) on expression of other proteins, particularly, for instance, endogenous proteins.
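One simple way to pick out lysates whose signal stands notably above background is to score each feature against the distribution of all features on the array. The sketch below uses a plain Z score for this purpose. The reference list includes a Z score approach to microarray data (Cheadle et al., 2003), but the source does not state which scoring method was applied to the data in FIG. 4, so this is an illustration under that assumption only. The intensities, the threshold of 2.0 and the function name flag_hits are hypothetical.

```python
from statistics import mean, stdev


def flag_hits(signals, threshold=2.0):
    """Return names of features whose Z score exceeds the threshold."""
    values = list(signals.values())
    if len(values) < 2:
        raise ValueError("need at least two features")
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # every feature has the same signal, so nothing stands out
    return [name for name, v in signals.items() if (v - mu) / sigma > threshold]


# Hypothetical background-subtracted intensities (not data from the source).
signals = {
    "lysate_01": 10.0, "lysate_02": 12.0, "lysate_03": 9.0,
    "lysate_04": 11.0, "lysate_05": 10.0, "lysate_06": 13.0,
    "lysate_07": 8.0, "lysate_08": 11.0, "lysate_09": 10.0,
    "target_lysate": 100.0,
}

hits = flag_hits(signals)
print(hits)
```

On a real array with thousands of features, a strong hit inflates the mean and standard deviation far less than in this ten-feature toy example; a median-based score may still be preferable when many features respond.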

Example 7 Decoding Monoclonal Antibodies Generated by Whole Cell Immunization

Whole cell immunization can be used to generate monoclonal antibodies, such as highly specific monoclonal antibodies for biomarker assays and cancer therapy. However, wider use of whole cell immunization techniques is hampered by the difficulty of determining the targets of the monoclonal antibodies that are initially obtained. Often this task is very expensive and takes years to carry out.

In embodiments overexpression lysate microarray chips can be used to quickly determine the protein-binding targets of protein binding agents, such as the cellular protein binding specificities of monoclonal antibodies generated by whole cell immunization techniques. Determining the binding specificity or specificities of binding partners, such as antibodies, is referred to herein as decoding.

Embodiments in this regard are illustrated by identification of the targets of a commercially available anti-E-cadherin antibody that was initially obtained by immunizing mice with MCF-7 mammary carcinoma cells (Shimoyama et al., 1989). Immunostaining data show that the target stands out clearly from over 3700 different genes (FIG. 5A). The conclusion is further supported by western blot analysis (FIG. 5B).

Example 8 Tumor Biomarker Discovery

A variety of proteins serve as disease indicators and surrogate end points for developing therapeutics. Auto-antibodies produced by patients with cancer against tumors represent a class of proteins that could prove to be valuable diagnostic and prognostic indicators of disease. This potential has not been realized, in part because of the difficulty of characterizing auto-antibodies in human sera.

A variety of methods have been developed in hopes of overcoming this difficulty, such as SEREX (Serological Identification of Antigens by Recombinant Expression Cloning) and SERPA (Serological Proteome Analysis) (Gunawardana and Diamandis, 2007), but they all have substantial disadvantages.

SEREX can provide wide breadth of coverage with clear annotation for each clone, but it is based on screening of prokaryotic cDNA expression libraries. As a result the recombinant proteins used for screening do not have any posttranslational modifications. Moreover, the technology makes it difficult to study large numbers of patient serum samples at the discovery stage.

SERPA identifies auto-antibody targets by immunoblotting and MS. In essence, sera containing auto-antibodies are used as a probe to detect cognate antigens in human tissue lysates subjected to 2-D IEF/SDS PAGE. Protein antigens in the gel that bind auto-antibodies in the sera then are identified by mass spectrometry. The technique subjects proteins to harshly denaturing conditions, and suffers from detection insensitivity and irreproducibility. In addition, it can be difficult to identify antigens from the limited data that the technique provides: IEP, size and a mass spectrum, typically contaminated by other proteins involved in carrying out the western blot and detection steps.

Embodiments herein described overcome the limitations of methods such as SEREX and SERPA for identifying and characterizing protein antigens of auto-antibodies. In embodiments, the protein arrays have a clear annotation for each gene, so that the proteins are known at each location in the array. Moreover, in embodiments the proteins in the arrays are produced and well processed post-translationally in the HEK293T expression system.

Such embodiments are illustrated by the identification of auto-antibodies in breast cancer patients. Microarray slides were incubated with sera from breast cancer patients or from age matched healthy patients and then immunostained to visualize where auto-antibodies in the patient sera bound proteins in the microarrays. The results reveal distinct auto-antibody immunoreactivity patterns for different human sera, illustrated in FIG. 6.

Example 9 Purified Protein Arrays—One Step Immunopurification

Previous research has shown that anti-Flag immunoaffinity purification technology can be used to isolate Flag-tagged multiple subunit protein complexes from overexpression lysates under native conditions (Chiang et al., 1993; Gloeckner et al., 2009). We applied this approach to isolate FLAG-myc tagged proteins expressed using the pCMV6-entry vector in HEK293T cells. The general approach is depicted schematically in Tables 2 and 3 and results for 10 randomly chosen lysates are shown in FIG. 7. The approach can be used with any epitope tags, including, but not limited to, His, myc, FLAG, V5, GST, T7, HSV, VSV-g, Glu-Glu, HA, E-tag and others.

Example 10 On Chip Purification

On-chip purification can be used to produce microarrays as described herein. A general scheme for making protein arrays using on-chip purification is depicted in Table 3 and FIG. 8.

This example describes the production of a protein array with 10,464 purified human recombinant proteins using a DDK epitope and anti-DDK antibody (FLAG epitope and anti-FLAG antibody). Flag is a highly immunogenic peptide. The interaction between the Flag epitope tag and anti-Flag antibody is exceptionally strong and specific (Chiang and Roeder, 1993). High quality anti-Flag antibodies have been produced in different species, including mouse, rabbit, goat and even chicken. Such antibodies are available commercially, such as those from OriGene, Inc., which offers high quality anti-Flag mouse monoclonal and rabbit polyclonal antibodies, often used for immunoprecipitation analysis.

Efficacy of the on-chip purification is depicted in FIG. 9. HEK293T cell lysates comprising FLAG-myc tagged proteins expressed in the HEK293T cells via the pCMV6-entry vector or lysates of empty pCMV6-entry vector transformed cells (negative controls) were spotted onto uncoated nitrocellulose slides (negative control) and nitrocellulose slides coated with anti-FLAG antibody. Slides were then probed with anti-myc antibodies to visualize immobilized FLAG-myc tagged proteins or with anti-beta-actin antibodies to visualize actin, representing untagged cellular protein.

As illustrated in FIG. 9, anti-myc antibody binding revealed that myc-FLAG tagged proteins bound to the anti-FLAG coated nitrocellulose slides in a tight, relatively uniform, densely staining spot, whereas they bound to the uncoated slides in a broader, more diffuse and less dense spot. There was no staining of the negative controls by anti-myc antibody on either type of slide. There was no anti-beta-actin immunostaining of the spots on the anti-FLAG coated slide, showing that the blocking step was effective to prevent non-specific binding and that bulk cellular protein was efficiently washed away after spotting the lysate on the slide. Anti-myc antibody binding to the uncoated slides showed binding of bulk cellular protein in a diffuse spot with a dark annulus.

The on-chip one step immunoaffinity purification can be used with any tagged proteins and thus can be applied broadly to proteins produced via an exogenous DNA using a vector that expresses tagged fusion proteins. A variety of tags can be used in the same way as DDK (FLAG) for this purpose.

VII

The following references and those cited elsewhere herein are expressly incorporated herein in their entireties, particularly as to the specific subject for which they are referenced herein.

Information on polynucleotides, expression of exogenous genes, expression vectors, and protein production in transformed cells which may be useful in carrying out embodiments of the invention is well known and widely available in the arts to which embodiments pertain. Such information may be found, for instance, in the following references.

Hames et al., Polynucleotide Hybridization, IRL Press, 1985.

Davis at al., Basic Methods in Molecular Biology, Elsevir SciencesPublishing, Inc., New York, 1986.

Sambrook et al., Molecular Cloning, 3^(rd) Ed. CSH Press, 2001.

Howe, Gene Cloning and Manipulation, Cambridge University Press, 1995.

Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., 1994 to the present.

Additional information, noted in citations herein, may be found in the following references.

Barbulovic-Nad, I., Lucente, M., Sun, Y., Zhang, M., Wheeler, A. R., and Bussmann, M. (2006). Bio-microarray fabrication techniques - a review. Crit Rev Biotechnol 26, 237-259.

Chan, S. M., Ermann, J., Su, L., Fathman, C. G., and Utz, P. J. (2004). Protein microarrays for multiplex analysis of signal transduction pathways. Nat Med 10, 1390-1396.

Cheadle, C., Vawter, M. P., Freed, W. J., and Becker, K. G. (2003). Analysis of microarray data using Z score transformation. J Mol Diagn 5, 73-81.

Chiang, C. M., Ge, H., Wang, Z., Hoffmann, A., and Roeder, R. G. (1993). Unique TATA-binding protein-containing complexes and cofactors involved in transcription by RNA polymerases II and III. Embo J 12, 2749-2762.

Chiang, C. M., and Roeder, R. G. (1993). Expression and purification of general transcription factors by FLAG epitope-tagging and peptide elution. Pept Res 6, 62-64.

Ekins, R. P. (1989). Multi-analyte immunoassay. J Pharm Biomed Anal 7, 155-168.

Gloeckner, C. J., Boldt, K., Schumacher, A., and Ueffing, M. (2009). Tandem Affinity Purification of Protein Complexes from Mammalian Cells by the Strep/FLAG (SF)-TAP Tag. Methods Mol Biol 564, 359-372.

Goshima, N., Kawamura, Y., Fukumoto, A., Miura, A., Honma, R., Satoh, R., Wakamatsu, A., Yamamoto, J., Kimura, K., Nishikawa, T., et al. (2008). Human protein factory for converting the transcriptome into an in vitro-expressed proteome. Nat Methods 5, 1011-1017.

Guilleaume, B., Buness, A., Schmidt, C., Klimek, F., Moldenhauer, G., Huber, W., Arlt, D., Korf, U., Wiemann, S., and Poustka, A. (2005). Systematic comparison of surface coatings for protein microarrays. Proteomics 5, 4705-4712.

Gunawardana, C. G., and Diamandis, E. P. (2007). High throughput proteomic strategies for identifying tumour-associated antigens. Cancer Lett 249, 110-119.

Haab, B. B. (2006). Applications of antibody array platforms. Curr Opin Biotechnol 17, 415-421.

He, M., Stoevesandt, O., Palmer, E. A., Khan, F., Ericsson, O., and Taussig, M. J. (2008). Printing protein arrays from DNA arrays. Nat Methods 5, 175-177.

Hultschig, C., Kreutzberger, J., Seitz, H., Konthur, Z., Bussow, K., and Lehrach, H. (2006). Recent advances of protein microarrays. Curr Opin Chem Biol 10, 4-10.

Husi, H., and Grant, S. G. (2001). Isolation of 2000-kDa complexes of N-methyl-D-aspartate receptor and postsynaptic density 95 from mouse brain. J Neurochem 77, 281-291.

Ikura, T., Ogryzko, V. V., Grigoriev, M., Groisman, R., Wang, J., Horikoshi, M., Scully, R., Qin, J., and Nakatani, Y. (2000). Involvement of the TIP60 histone acetylase complex in DNA repair and apoptosis. Cell 102, 463-473.

LaBaer, J., and Ramachandran, N. (2005). Protein microarrays as tools for functional proteomics. Curr Opin Chem Biol 9, 14-19.

Li, A. G., Piluso, L. G., Cai, X., Gadd, B. J., Ladurner, A. G., and Liu, X. (2007). An acetylation switch in p53 mediates holo-TFIID recruitment. Mol Cell 28, 408-421.

MacBeath, G., and Schreiber, S. L. (2000). Printing proteins as microarrays for high-throughput function determination. Science 289, 1760-1763.

Spurrier, B., Honkanen, P., Holway, A., Kumamoto, K., Terashima, M., Takenoshita, S., Wakabayashi, G., Austin, J., and Nishizuka, S. (2008). Protein and lysate array technologies in cancer research. Biotechnol Adv 26, 361-369.

Spurrier, B., Washburn, F. L., Asin, S., Ramalingam, S., and Nishizuka, S. (2007). Antibody screening database for protein kinetic modeling. Proteomics 7, 3259-3263.

Stein, L. D. (2004). Human genome: end of the beginning. Nature 431, 915-916.

Stornaiuolo, M., Lotti, L. V., Borgese, N., Torrisi, M. R., Mottola, G., Martire, G., and Bonatti, S. (2003). KDEL and KKXX retrieval signals appended to the same reporter protein determine different trafficking between endoplasmic reticulum, intermediate compartment, and Golgi complex. Mol Biol Cell 14, 889-902.

Tabrizi, M. A., Bornstein, G. G., Klakamp, S. L., Drake, A., Knight, R., and Roskos, L. (2009). Translational strategies for development of monoclonal antibodies from discovery to the clinic. Drug Discov Today 14, 298-305.

VanMeter, A., Signore, M., Pierobon, M., Espina, V., Liotta, L. A., and Petricoin, E. F., 3rd (2007). Reverse-phase protein microarrays: application to biomarker discovery and translational medicine. Expert Rev Mol Diagn 7, 625-633.

Zhu, H., Bilgin, M., Bangham, R., Hall, D., Casamayor, A., Bertone, P., Lan, N., Jansen, R., Bidlingmaier, S., Houfek, T., et al. (2001). Global analysis of protein activities using proteome chips. Science 293, 2101-2105.

Monneret, C. (2005). Histone deacetylase inhibitors. Eur J Med Chem 40, 1-13.

Nabholtz, J. M., Reese, D. M., Lindsay, M. A., and Riva, A. (2002). HER2-positive breast cancer: update on Breast Cancer International Research Group trials. Clin Breast Cancer 3 Suppl 2, S75-79.

Nishizuka, S., Charboneau, L., Young, L., Major, S., Reinhold, W. C., Waltham, M., Kouros-Mehr, H., Bussey, K. J., Lee, J. K., Espina, V., et al. (2003). Proteomic profiling of the NCI-60 cancer cell lines using new high-density reverse-phase lysate microarrays. Proc Natl Acad Sci U S A 100, 14229-14234.

Oldfield, C. J., Meng, J., Yang, J. Y., Yang, M. Q., Uversky, V. N., and Dunker, A. K. (2008). Flexible nets: disorder and induced fit in the associations of p53 and 14-3-3 with their partners. BMC Genomics 9 Suppl 1, S1.

Payne, M. E., Fong, Y. L., Ono, T., Colbran, R. J., Kemp, B. E., Soderling, T. R., and Means, A. R. (1988). Calcium/calmodulin-dependent protein kinase II. Characterization of distinct calmodulin binding and inhibitory domains. J Biol Chem 263, 7190-7195.

Pelham, H. R. (1990). The retention signal for soluble proteins of the endoplasmic reticulum. Trends Biochem Sci 15, 483-486.

Ramachandran, N., Raphael, J. V., Hainsworth, E., Demirkan, G., Fuentes, M. G., Rolfs, A., Hu, Y., and LaBaer, J. (2008). Next-generation high-density self-assembling functional protein arrays. Nat Methods 5, 535-538.

Schnack, C., Hengerer, B., and Gillardon, F. (2008). Identification of novel substrates for Cdk5 and new targets for Cdk5 inhibitors using high-density protein microarrays. Proteomics 8, 1980-1986.

Sheng, Y., Saridakis, V., Sarkari, F., Duan, S., Wu, T., Arrowsmith, C. H., and Frappier, L. (2006). Molecular recognition of p53 and MDM2 by USP7/HAUSP. Nat Struct Mol Biol 13, 285-291.

Shimoyama, Y., Hirohashi, S., Hirano, S., Noguchi, M., Shimosato, Y., Takeichi, M., and Abe, O. (1989). Cadherin cell-adhesion molecules in human epithelial tissues and carcinomas. Cancer Res 49, 2128-2133.

1. A method for making a protein array, comprising applying lysates L₁ through L_(n) comprising proteins P₁ through P_(n) to positions S₁ through S_(n) on a support, wherein each lysate L_(x) is of cells C_(x), comprising protein P_(x) expressed therein via exogenous DNA D_(x) and is applied to position S_(x), wherein P₁ through P_(n) are all different from one another, S₁ through S_(n) are all different from one another, n is an integer greater than 1 and x is an integer from 1 to n.
2. A method for making a protein array, comprising applying proteins P₁ through P_(n) to positions S₁ through S_(n) on a support, wherein each protein P_(x) is expressed in cells C_(x) via exogenous DNA D_(x) and is applied to position S_(x), wherein P₁ through P_(n) are all different from one another, S₁ through S_(n) are all different from one another, n is an integer greater than 1 and x is an integer from 1 to n.
3. A method according to claim 1, wherein the proteins comprise at least 1,000 different loci of an organism.
4. A method according to claim 1, wherein the proteins collectively comprise at least 20 percent of the proteins encoded by the genome of an organism.
5. A method according to claim 1, wherein the genome is a human genome.
6. A method according to claim 1, wherein for at least 50 percent of said lysates, the amount of lysate protein applied to each position in the array is the amount of total protein in at least 100 cells of the lysate.
7. A method according to claim 1, wherein the proteins or the lysates are applied to nitrocellulose on a glass slide.
8. A method according to claim 1, wherein the support is coated with a capture reagent specific for an affinity tag and the proteins comprise the tag and are purified by binding to the capture reagent.
9. A protein array, comprising lysates L₁ through L_(n) comprising proteins P₁ through P_(n) at positions S₁ through S_(n) on a support, wherein each lysate L_(x) is of cells C_(x), comprising protein P_(x) expressed therein via exogenous DNA D_(x) and applied to position S_(x), wherein P₁ through P_(n) are all different from one another, S₁ through S_(n) are all different from one another, n is an integer greater than 1 and x is an integer from 1 to n.
10. A protein array, comprising proteins P₁ through P_(n) at positions S₁ through S_(n) on a support, wherein each protein P_(x) is expressed in cells C_(x) via exogenous DNA D_(x) and applied to position S_(x), wherein P₁ through P_(n) are all different from one another, S₁ through S_(n) are all different from one another, n is an integer greater than 1 and x is an integer from 1 to n.
11. A protein array according to claim 9, wherein the proteins comprise at least 1,000 different loci of an organism.
12. A protein array according to claim 9, wherein the proteins collectively comprise at least 20 percent of the proteins encoded by the genome of an organism.
13. A protein array according to claim 9, wherein the genome is a human genome.
14. A protein array according to claim 9, wherein for at least 50 percent of said lysates, the amount of lysate protein applied to each position in the array is the amount of total protein in at least 100 cells of the lysate.
15. A protein array according to claim 9, wherein the proteins or the lysates are applied to nitrocellulose on a glass slide.
16. A protein array according to claim 9, wherein the proteins comprise an affinity tag and are bound to the support by a capture reagent specific for the affinity tag immobilized therein.
17. A method for determining the binding specificity of an antibody or antibody preparation comprising determining the binding of the antibody or antibody preparation to a protein array according to claim 9 and from the determination identifying the binding specificity of the antibody or antibody preparation for proteins in the array.
18. A method for determining protein biomarkers of disease, comprising determining binding of samples from one or more healthy individuals and from one or more diseased individuals suffering from a disease to a protein array according to claim 9, and from differences in the binding of the samples from the healthy and diseased individuals determining protein biomarkers of the disease.
19. A method for determining biomarkers of an autoimmune disease, comprising determining binding of antibody-containing samples from one or more healthy subjects and from one or more subjects suffering from an autoimmune disease to a protein array according to claim 9, and from differences in the binding of the antibodies in the samples from the healthy subjects and the subjects suffering from an autoimmune disease determining protein biomarkers of the autoimmune disease.
20. A method for determining biomarkers of a disease characterized by the presence of antibodies not present in healthy individuals, comprising determining binding of samples from one or more healthy subjects and from one or more subjects suffering from a disease characterized by the presence of antibodies not present in healthy individuals to a protein array according to claim 9, and from differences in the binding of the antibodies in the samples from the healthy subjects and the subjects suffering from the disease determining protein biomarkers of the disease.
21. A method for diagnosing a disease characterized by the presence of antibodies not present in healthy individuals, comprising determining binding of an antibody-containing sample from a subject possibly suffering from the disease to a protein array according to claim 9 and from the binding of antibodies in the sample to the array determining the absence or the presence of the disease.
22. A method for monitoring signal transduction pathways, comprising determining binding of a sample comprising proteins of signal transduction pathway proteins to a protein array according to claim 9, whereby binding to proteins in the array is indicative of the absence, the presence and/or the amount of said proteins.
23. A method for determining interactions between small molecules and proteins, comprising determining the binding of a sample comprising said small molecules to protein arrays according to claim 9.